首页> 外文OA文献 >Parallel Algorithms for Bayesian Networks Structure Learning with Applications in Systems Biology

【2h】

Parallel Algorithms for Bayesian Networks Structure Learning with Applications in Systems Biology

机译：贝叶斯网络结构学习的并行算法及其在系统生物学中的应用

代理获取

本网站仅为用户提供外文OA文献查询和代理获取服务，本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文，但由于OA文献来源多样且变更频繁，仍可能出现获取不到、文献不完整或与标题不符等情况，如果获取不到我们将提供退款服务。请知悉。

页面导航

摘要
著录项
相似文献
相关主题

摘要

The expression levels of thousands to tens of thousands of genes in a living cell are controlled by internal and external cues which act in a combinatorial manner that can be modeled as a network. High-throughput technologies, such as DNA-microarrays and next generation sequencing, allow for the measurement of gene expression levels on a whole-genome scale. In recent years, a wealth of microarray data probing gene expression under various biological conditions has been accumulated in public repositories, which facilitates uncovering the underlying transcriptional networks (gene networks). Due to the high data dimensionality and inherent complexity of gene interactions, this task inevitably requires automated computational approaches.Various models have been proposed for learning gene networks, with Bayesian networks (BNs) showing promise for the task. However, BN structure learning is an NP-hard problem and both exact and heuristic methods are computationally intensive with limited ability to produce large networks. To address these issues, we developed a set of parallel algorithms. First, we present a communication efficient parallel algorithm for exact BN structure learning, which is work-optimal provided that 2^n \u3e p.log(p), where n is the total number of variables, and p is the number of processors. This algorithm has space complexity within 1.41 of the optimal. Our empirical results demonstrate near perfect scaling on up to 2,048 processors. We further extend this work to the case of bounded node in-degree, where a limit d on the number of parents per variable is imposed. We characterize the algorithm\u27s run-time behavior as a function of d, establishing the range [n/3 - log(mn), ceil(n/2)) of values for d where it affects performance. Consequently, two plateaus regions are identified: for d \u3c n/3 - log(mn), where the run-time complexity remains the same as for d=1, and for d \u3e= ceil(n/2), where the run-time complexity remains the same as for d=n-1. Finally, we present a parallel heuristic approach for large-scale BN learning. This approach aims to combine the precision of exact learning with the scalability of heuristic methods. Our empirical results demonstrate good scaling on various high performance platforms. The quality of the learned networks for both exact and heuristic methods are evaluated using synthetically generated expression data. The biological relevance of the networks learned by the exact algorithm is assessed by applying it to the carotenoid biosynthesis pathway in Arabidopsis thaliana.

机译：活细胞中成千上万个基因的表达水平受内部和外部线索控制，这些线索以组合方式起作用，可以建模为网络。高通量技术（例如DNA微阵列和下一代测序）可用于测量全基因组规模的基因表达水平。近年来，在公共存储库中积累了许多在各种生物学条件下探测基因表达的微阵列数据，这有助于揭示潜在的转录网络（基因网络）。由于高数据维度和基因相互作用的内在复杂性，该任务不可避免地需要自动化的计算方法。已经提出了用于学习基因网络的各种模型，其中贝叶斯网络（BN）证明了该任务的前景。但是，BN结构学习是一个NP难题，并且精确方法和启发式方法都需要大量计算，且生成大型网络的能力有限。为了解决这些问题，我们开发了一组并行算法。首先，我们提出了一种用于精确BN结构学习的通信有效并行算法，该算法是工作最优的，条件是2 ^ n p.log（p），其中n是变量的总数，p是处理器的数量。该算法的空间复杂度在最佳值的1.41以内。我们的经验结果表明，在多达2,048个处理器上几乎可以完美缩放。我们将这项工作进一步扩展到有界节点度数的情况，其中对每个变量的父代数施加限制d。我们将算法的运行时行为描述为d的函数，建立了d的值范围[n / 3-log（mn），ceil（n / 2）），在此范围内它会影响性能。因此，确定了两个高原区域：对于d \ u3c n / 3-log（mn），运行时复杂度与对于d = 1相同，对于d \ u3e = ceil（n / 2），其中运行时复杂度与d = n-1相同。最后，我们提出了一种用于大规模BN学习的并行启发式方法。这种方法旨在将精确学习的精度与启发式方法的可扩展性相结合。我们的经验结果证明了在各种高性能平台上的良好扩展能力。使用合成生成的表达数据评估精确和启发式方法的学习网络质量。通过将其应用到拟南芥中的类胡萝卜素生物合成途径，可以评估通过精确算法学习的网络的生物学相关性。

著录项

作者
Nikolova, Olga;
展开▼
作者单位

展开▼
年度 2012
总页数
原文格式 PDF
正文语种 en
中图分类

相似文献

外文文献
中文文献
专利

1. Bayesian Network Constraint-Based Structure Learning Algorithms: Parallel and Optimized Implementations in the bnlearn R Package [J] . Marco Scutari Journal of Statistical Software . 2017,第1期

机译：基于贝叶斯网络约束的结构学习算法：bnlearn R包中的并行和优化实现
2. Monitoring and learning algorithms for dynamic hybrid Bayesian network in on-line system health management applications [J] . Iamsumang Chonlagarn, Mosleh Ali, Modarres Mohammad Reliability Engineering & System Safety . 2018,第OCTa期

机译：在线系统健康管理应用中动态混合贝叶斯网络的监视和学习算法
3. Parallel Simulated Annealing with a Greedy Algorithm for Bayesian Network Structure Learning [J] . Lee Sangmin, Kim Seoung Bum IEEE Transactions on Knowledge and Data Engineering . 2020,第6期

机译：贝叶斯网络结构学习贪婪算法并行模拟退火
4. Parallel Algorithms for Bayesian Networks Structure Learning with Applications to Systems Biology [C] . Nikolova Olga 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum . 2011

机译：贝叶斯网络结构学习的并行算法及其在系统生物学中的应用
5. Parallel algorithms for Bayesian networks structure learning with applications in systems biology [D] . Nikolova, Olga 2012

机译：贝叶斯网络结构与系统生物学应用的平行算法
6. The Inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology data sets de novo [O] . Richard Bonneau, David J Reiss, Paul Shannon, 2006

机译：The Inferelator：一种从系统生物学数据集重新学习简约监管网络的算法
7. Parallel Algorithms for Bayesian Networks Structure Learning with Applications in Systems Biology [O] . Olga Nikolova -1

机译：贝叶斯网络结构与系统生物学应用的平行算法
8. Stochastic Algorithms for Learning with Incomplete Data: An Application to Bayesian Networks [R] . Myers, J. W. 1999

机译：不完全数据学习的随机算法：贝叶斯网络的一种应用

Parallel Algorithms for Bayesian Networks Structure Learning with Applications in Systems Biology

摘要

著录项

相似文献

相关主题

期刊订阅